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A method and system for software and hardware multiplicity 



L DESCRIPTION 

UL Field 

This disclosure generally teaches a method and system for deploying and 
executing software objects in a network of processors. The disclosed techniques are 
embodied In computer platform systems, methods for creating a computer platform 
and related computer program products. 

LB. Background 

Conventional large-scale software systems often exceed the limits of the 
computational power that a single processor or machine can provide. It is therefore 
desirable to create software systems capable of running across many processors or 
machines. For the most part, the emphasis in conventional systems has revolved 
around hardware and operating system level solutions. But with widespread use 
and availability of the Internet and ubiquity of networking, the abstraction level has 
moved up. This has led to research and development of systems based on remote 
objects. 

While implementations of remote objects have varied, the general approach 
Involves, for example; marshalling objects, method invocations, and data attributes 
into a byte stream for transit across a network, the byte stream representing the 
remote object is then demarshalled by another machine and executed 'remotely/ 
Common implementations of such remote object systems Include: DCE, CORBA, 
XML, COM/DCOM, DOM, SOAP, and RMI. These technologies enable machines on a 
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network to execute code remotely located on another machine. This in turn opens 
possibilities for the creation of scalable systems. 

In practice, the number of machines (and therefore the scalability) of such 
systems is often limited to less than a dozen. This is primarily due to the 
interdependencies between layers of software, ""synchronization" of stubs and 
skeletons, and the need to spread the system load efficiently across these 
machines. This Is further complicated by the difficulty of properly configuring a 
large number of systems and deploying the correct versions of software across all 
machines. Conventional systems using this abstraction layer (remote objects) 
additionally suffer from the lack of open, flexible, and unified automation controlling 
resources, objects and redundancy within the system. 

The end result is that most conventional remote object systems In place 
today suffer from outages caused by improper configuration or deployment of 
software and hardware. Additional outages are incurred (and considered normal) 
during the upgrade or replacement of software and hardware elements. These 
upgrades are often performed at 'off hours (evenings/weekends) when system 
availability is considered less critical. With the advent of ubiquitous computing and 
the growing importance of the Internet, however; such outages are no longer 
tolerated as customers and vendors now expect true 24 x 7 system availability. 

IL SUMMARY 

To overcome the above problems in conventional technologies, an object of 
the present teaching Is to provide a solution that enables systems to remain highly 
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available at all times and eliminates the need to disable system access during 
upgrades and maintenance. 

To meet the objectives of the disclosed teaching, a method for creating a 
scalable, fault tolerant computing platform on a networi<, said platform comprising 
a plurality of nodes, said method comprising initiating melding during Initialization, 
wherein during said melding said plurality of nodes are treated as a collection and 
each node from said plurality of nodes joins said collection, and during melding at 
least a first object router is assigned; migrating objects between said plurality of 
nodes; and replicating objects across said plurality of nodes. 

Preferably, during melding each node joins the collection using a process 
comprising: the node finding a nearby object router if the object router can be 
found, otherwise becoming the object router; and the node reporting available 
resources to the object router. 

Preferably, during melding each node joins the collection using a process 
further comprising assigning network address for the node; and updating 
components of the node. 

Preferably, the available resources comprise memory, and at least one 

processor. 

Still preferably, the available resource comprises network bandwidth. 
Still preferably, the available resources comprise drive storage space. 
Still preferably, said updating Is done using data stored in a code version 
server. 

Still preferably, It is ensured that at least one mirror of an object router 
exists all the time. 

3 
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Preferably, the object router manages resources and distributes load evenly 
across the plurality of nodes. 

Preferably, the object router maintains knowledge of locations of at least a 
subset of all nodes In the computing platform. 

Preferably, the object router relocates or replicates objects within the 
plurality of nodes. 

Preferably, said migration of an object is performed using a process 
comprising locking the object; converting the object to a byte stream at a first node 
in the plurality of nodes; transferring the byte stream across the network; 
converting the byte stream back to the object at a second node in the plurality of 
nodes; updating a location of the object In the object router; and unlocking the 
object. 

Preferably, said replication of an object is performed by a process 
comprising: locking the object at a first node; forming a duplicate object ; locating 
the duplicate object at a second node; and unlocking the object. 

Still preferably, the method further comprises subscribing to events 
associated with the object. 

Still preferably, said duplication is performed by creating a byte stream from 
the object and converting said byte stream into a duplicate object. 

Preferably, the computing platform allows for evolution, wherein said 
evolution provides for transitioning of system software and hardware such that a 
seamless upgrade is possible. 

Another aspect of the disclosed teachings is a method for performing 

evolution, said evolution being performed to enable a seamless transitioning of 
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software, said method comprising: evaluating a software or hardware to be 
Installed or upgraded; creating a dependency hierarchy tree of all objects in the 
computing platform at compile time; performing object class diff comparison 
between old and new versions of at least one class of objects; creating a transform 
function for each state variable that is changed; selecting an object that is an 
instance of the selected object class; locking the selected object; creating 
uninitialized instance of the new version of the selected object; initializing 
variables of the new object class instance by applying appropriate transform 
functions to the old object version; unlocking the selected object; and repeating 
relevant steps for each instance of the object class. 

Still preferably, the method comprises repeating relevant steps for each 
object class In the computing platform. 

Still preferably, the method comprises deploying client-side software- 
Still preferably, the method comprises destroying older version of the objects 
if multiple versions of the objects are not required. 

Preferably, the object router maintains multiple instances of each of said 
objects. 

Preferably, a failure of a first node from the plurality of nodes is detected 
when a proxy object from a second node is unable to reach the first node. 

Preferably, a failure of a node from the plurality of nodes is detected when 
the node fails to respond to heartbeat requests from the object router. 

Preferably, a failure of node from the plurality of nodes Is detected when the 
node fails to report either status or statistics to the object router. 
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Preferably, when a node from the plurality of nodes falls, said node being 
called a failed node, said at least one object router responds using a process 
comprising: notifying other object routers other than said at least one object 
router; selecting an object that resided in the failed node; making replicated object 
corresponding to the selected object residing in nodes other than the failed node 
primary; locating additional computing resources within the computing platform; 
creating new replicates for the newly created primary objects and locating them in 
nodes other than nodes in which they originally resided; and notifying said other 
object routers of new location of the new replicates. 

Still preferably, the method comprises notifying a system administrator about 
the failed node being off-line. 

Still preferably, the method comprises repeating relevant for all objects in 
the felled node. 

Preferably, when an object router.from said at least one object router fails, 
said object router that fails being called a felled object router, the computing 
platform responds using a process comprising: utilizing services of object routers 
other than said felled object router; replicating the failed object router to create a 
replicate object router; locating the replicate object router on at least one 
functional node ; and notifying at least a subset of the nodes of a new location of 
the replicate object route;*. 

Still preferably, the method comprises notifying at least a subset of other 
object routers of the new location of the replicate object router- 

Preferably, said object router Is aware of at least a subset of other object 
routers. 

6 
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Another aspect of the disclosed teachings Is a network computing platform 
comprising a plurality of nodes, said system comprising: at least an object router, 
wherein said object router Is capable of managing resources and distributing load 
evenly across the plurality of nodes, and wherein said object router maintains a 
knowledge of locations of at least a subset of all nodes in the computing platform. 
Preferably, the object router relocates objects within the plurality of nodes. 
Preferably, the object router maintains multiple Instances of each object that 
Is deployed in the platform. 

Preferably, each of said nodes comprise: an object melder that treats the 
plurality of nodes as a collection that performs melding to create said plurality of 
nodes and said at least one object router. 

Preferably, the platform further comprises an object versioner that maintains 
multiple versions of all objects within the platforms. 

Preferably, the platform further comprises an object replicator that replicates 
an object to maintain a duplicate of the object in a node different from the node in 
which the object resided. 

Preferably, the platform further comprises an object migrator that enables an 
object to migrate from one node to another. 

Preferably, the platform further comprises a resource monitor. 
Preferably, the platform further comprises a resource controller. 
Preferably, the platform further comprises an application object. 
Preferably, each of said nodes further comprise: a class loader that is 
responsible for loading classes of objects, wherein the class loader permits multiple 
versions of a same class to co-exist. 
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Preferably the platform further comprises an evolution module, wherein the 
evolution module allows the computing platform to perform evolution, wherein said 
evolution provides for transitioning of system software and hardware such that a 
seamless upgrade is possible. 

Preferably, a failure of a first node from the plurality of nodes is detected by 
the platform when a proxy object from a second node is unable to reach the first 
node. 

Preferably, a failure of a node from the plurality of nodes is detected by the 
platform when the node falls to respond to heartbeat requests from the object 
router. 

Preferably, a failure of node from the plurality of nodes is detected by the 
platform when the node falls to report status or statistics to the object router. 

Preferably, when a node from the plurality of nodes fells, said node being 
called a failed node, said at least one object router is capable of notifying other 
object routers other than said at least one object router. 

Preferably, W hen a node from the plurality of nodes fails, said node being 
called a failed node, said at least one object router is capable of notifying other 
nodes 

Preferably, when a node from the plurality of nodes fails, said node being 
called a felled node, said at least one object router Is capable of making replicated 
object corresponding to at least a subset of objects residing in nodes other than the 
felled node primary. 

Preferably, when a node from the plurality of nodes fails, said node being 

called a failed node, said at least one object router is capable of locating additional 
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computing resources within the computing platform and creating new replicates for 
newly created primary objects and locating them in nodes other than nodes In 
which they originally resided. 

Still preferably, the platform Is capable of notifying said other object routers 
of new location of the new replicates. 

Still preferably, the platform Is capable of notifying a system administrator 
about the failed node being off-line. 

Preferably, when an object router from said at least one object router falls, 
the computing platform Is capable of utilizing services of object routers other than 
said failed object router. 

Preferably, when an object router from said at least one object router fails, 
the computing platform is capable of replicating the failed object router to create a 
replicate object router and locating the replicate object router on at least one 
functional node. 

Still preferably, the platform is further capable of notifying at least a subset 
of the nodes of a new location of the replicate object router. 

Preferably, said object router is aware of at least a subset of other object 
routers. 

Another aspect of the present invention is an object router for a network 

computing platform comprising a plurality of nodes, wherein said object router is 

capable of managing resources and distributing load evenly across the plurality of 

nodes, and wherein said object router maintain a knowledge of locations of at least 

a subset of all nodes In the computing platform. 

Preferably, the object router relocates objects, within the. plurality of nodes. 
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Preferably, the object router maintains multiple instances of each object that 
Is deployed In the platform. 

Preferably, when a node from the plurality of nodes fails, the object router is 
capable of notifying other object routers other than said object router. 

Preferably, when a node from the plurality of nodesf ails, the object router is 
capable of making replicated object corresponding to at least a subset of objects 
residing in nodes other than the failed node primary. 

Preferably, when a node from the plurality of nodes fails, the object router is 
capable of locating additional computing resources within the computing platform 
and creating new replicates for the new primary objects and locating them in nodes 
other than nodes in which they originally resided. 

Preferably, the object router is aware of at least a subset of other object 
routers. 

The disclosed teachings also include a computer program product including a 
computer-readable medium comprising instructions to enable a computer to 
Implement the disclosed methods. 

DL BRIEF DESCRIPTION OF THE DRAWINGS 

The above objectives and advantages of the present Invention will become 
more apparent by describing in detail preferred embodiments thereof with reference 
to the attached drawings in which: 

FIG.l (a) shows a general network platform to which the techniques 
disclosed can be implemented. 

FIG. 1(b) shows a platform embodying the disclosed techniques. 

10 



WO 02/44835 PCT/US01/43154 

FIG.2 shows the architecture of a node with the disclosed platform. 
FIG.3 (a)-(c) show three examples of replication. 
FIG.4 shows an object version tree. 

FIG. 5 (a)-(b) show an embodiment of embedded object versioning. 
FIG.6 shows an embodiment of object transformation. 
FIG. 7 shows an example of classloader chaining. 
FIG.8 shows an example of classloader hierarchy with multiple object 
versions loaded. 

FIG.9 shows examples of locations where a system embodying the disclosed 
techniques can be installed. 

IV, DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Preferred Embodiments of the disclosed teachings and techniques, are 
provided that enable the efficient distribution and Youting' of software objects 
across many machines while automating the configuration and upgrade of both 
hardware and software elements. 

The preferred embodiment described herein creates a highly scalable, fault 
tolerant computing platform that Is as easily maintainable as a Redundant Array of 
Inexpensive Disks (RAID) based drive array. It could be referred to as a RAIC™ 
"Redundant Array of Inexpensive Computers™". 

Using the disclosed techniques, the preferred embodiment Incorporates 
characteristics that allow for dynamic hardware and software reconfiguration and 
growth. It achieves this by Introducing several unique features Into the platform as 
follows: 

11 
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♦ Melding:. Machine/Node configuration Is highly automated. 

♦ Auto Install: Software Is automatically loaded and distributed across the nodes, 

♦ Object Routing: Objects are automatically routed across the nodes to Increase 
scalability and reliability. 

♦ Automated Recovery: The computing platform Is self-healing in nature, when 
node or component failure occurs. 

♦ Evolution: Through automated management of software versionlng and 
dependencies, the computing platform can be seamlessly upgraded without any 
noticeable Interruption of services to end-users. 

Operation of the preferred embodiment may be subdivided Into a number of 
composite processes. It should be noted that these composite processes or process 
steps may be used separately or In combination, as claimed. The scope of the 
disclosed techniques is not limited to any particular combination. 

Atypical network computing platform is shown in FIG.l. A first plurality of 

gateway nodes 1.2 are connected to the Internet 1.1. A plurality of processing 

nodes 1.3 are connected to the first plurality of gateway nodes 1.2. A second 

plurality of gateway nodes 1.4 are connected to the processing nodes 1.3. Legacy 

Data Sources 1.4, RAID 1.5 and management equipment 1.6 are connected to the 

second plurality of gateway nodes. FIG. 1(b) shows a platform embodying the 

disclosed techniques. The object routers are marked OR. The objects are marked OJ 

and replicators are marked V. 

FIG.2 shows a node architecture for the preferred embodiment. During 

system initialization/boot performed at 2.1, a signaling process called melding 
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occurs. The melding process is performed by a melder. During the process of 
melding, each node discovers other nodes and components present in the network 
and informs the network of its system resources including processing power, 
memory, storage space, and network capabilities. Melding also includes the 
assignment of an object router 2.2 responsible for directing object access requests 
to and from each node. Object routers monitor resource utilization across all nodes 
and distribute requests accordingly. Object routers also maintain multiple instances 
of each object (located on disparate nodes) to prevent single points of failure within 
the network. It should be noted that In the disclosed system, object routers may 
serve as a replacement for more primitive remote object lookup services (including 
but not limited to services such as COSNaming specified by OMG). 

In order to handle the shutdown or failure of an individual node within the 
network, processes called migration and replication, performed at 2.3, provide fault 
tolerance and the ability to move live (executing) objects across nodes. Migration is 
performed by the object migrator. Replication is performed by the object replicator. 
These processes are also utilized during hardware upgrades, during which time 
individual nodes are effectively shutdown and unavailable. While individual nodes 
are unavailable, this remains transparent at the system level and the overall 
system of nodes remains highly available to all users. 

In order to accommodate software changes, a process named evolution 
utilizes the above named capabilities In conjunction with transform functions and 
sophisticated object verslonlng techniques. This enables seamless transitioning of 
software components while keeping the system available to end users. 
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Some of the processes and components of the platform described briefly 
above are discussed in further detail herein below. 



WJL Melding 

Melding Is the process of individual computing nodes joining the larger 
'collective' of nodes. 

During melding, a method such as the internet protocol DHCP is utilized to 
assign IP or other network addresses for each node. 

Then the bootstrap of core components including a module/class loader 
occurs. It should be noted that, each node's core components are updatable from 
data stored in a *code version server" (http, ftp or other network protocol server 
enabling data exchange) during startup. 

Next, the location of an object router is determined. After the node boots, it 
attempts to find a nearby object router node utilizing multicast packets, broadcast 
packets, or a w well known" (preconflgured) lookup address (such as in an Internet 
protocol DNS server). If no object router is located, the node loads the 'router 
module' and becomes an object router node. In the preferred embodiment, it is 
ensured that at least one router 'mirror' exists at all times. 

Then, resource reporting Is performed. During this reporting, the node 
reports its available hardware and software resources to Its object router . 
Statistics reported can Include available memory, processing units, drive storage 
space, network bandwidth, software installed, and software versions. 
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IV.B. Object Router 

The object router keeps track of available resources. The object router 
tracks utilization of resources within the nodes it manages. It is responsible for 
distributing load evenly across its nodes. 

The object router also performs remote object tracking. The object router 
maintains a knowledge of location of objects located within its nodes. It is 
responsible for relocating objects across nodes when required. It should be noted 
that location can include but Is not limited to a network address and port number, 
DNS name, URL, or remote object reference as specified In a remote object system. 

/KG Resource Monitoring/Control 

The preferred embodiment provides for resource monitoring and control. 
Resource monitoring is performed by a resource monitor module. Resource control 
is performed by the node resource controller. Both of these modules are utilized by 
the object router to monitor and control the state of individual nodes. 

In a Java based Implementation, these modules can be Implemented utilizing 
JNI. In such a case, standardized interfaces for accessing operating system kernel 
structures such as process IDs, threads, memory utilization, network activity, disk 
activity, and related statistics, are created. These Interfaces are then Implemented 
with custom JNI wrapper objects. 

In such a Java based Implementation, JNI wrapper objects also provide a 
mechanism to control process priorities, threads, operating system configuration, 
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and system reboot. Such wrapper objects urHfy these operations across a variety of 
operating systems. 



[V.D. Migration: 

Migration Is the process of moving an object from one node to another. 
Migration is performed by the object migrator. 

First an object Is locked. This Is done to prevent state Inconsistency during its 
migration. 

Then the object Is serialized. During this step, the object Is converted to a 
byte stream for transit across the network. 

Then the object is relocated. During this step the byte stream is converted 
back to an object at an available node. 

Finally the object is unlocked. The location of the object is updated in the 
object router and made available for use. 

IKK Replication 

Replication is the process of duplicating an object across multiple nodes. 
Replication Is performed by the object Replicator. FIGs.3(a)-(c) show three 
embodiments of object replication. 

IV.R_l.Mbde 1 

FIG.3(a) illustrates one example called "mode 1" of replication whereby an 
operation is initiated by the client of an object. This client request may come from 
either objects located on another node or server, or a software application. The 
operation (or method call) Is sent to a local proxy object in the originating node, 
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which (after having obtained location of the remote, replicated objects from an 
object router), sends that request to the first (or primary) remote object located at 
Node A, Node A Immediately forwards the same marshaled operation to a 
replicated object at Node B r then the primary object at Node W proceeds to 
perform the operation, modifying its state as required. The replicated object at 
Node B simultaneously performs the same operation, and sends any result set 
(return parameters) back to the proxy object at the node of origination. Optionally, 
a checksum value may be calculated at both remote nodes on either the result set, 
the internal state variables of the object, or both. This checksum step is utilized to 
detect errors, and correct faults that occur. 

The mode of replication illustrated in FIG.3(a) has the advantage that it 
minimizes network traffic between the nodes and reduces the likelihood of network 
congestion at the originating node. It also ^guarantees" (to a degree) that all 
replicates are kept synchronized with each operation. A potential negative of this 
approach Is increased message latency due to the additional network 'hop' required 
from Node A to Node B. 

Note that this approach can easily be extended by adding additional nodes In 
the chain between Node A and Node B with slight latency penalty for each 
additional node/replicate added. 

IV.E.2.Mode2 

FIG. 3(b) Illustrates a second example, called n mode 2", of replication where 

an operation Is Initiated by the client of an object (which may come from either 

objects located on another node or server or a software application). The operation 

(or method call) Is sent to a local proxy object In the originating node, which (after 
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having obtained location of the remote, replicated objects from an object router), 
sends that request simultaneously to both remote objects located at Nodes A and 
B. The replicated objects at Nodes A and B perform the operation In parallel, and 
the 'primary' remote object Is responsible to send a response to the proxy object 
located in the originating node. 

Optionally, checksum values of result sets, Internal state variables, or both 
may be sent to the local proxy object, or as a "cross-check" between the remote 
objects. This checksum value Is utilized to detect errors and Insure consistency. 

Note that this approach can easily be extended by adding additional nodes 
that the proxy communicates with, however a tradeoff in terms of network 
bandwidth utilization is to be expected. This approach reduces expected message 
latency, but Increases overall network utilization. 

TV.R3.Mode3: 

A third technique, shown In FIG.3(c), Involves the proxy object 
communicating directly with a primary remote object, whose state changes are 
mirrored to one or more secondary remote object(s) that have 'subscribed' to 
events reflecting state changes within the primary object. Again, checksum 
messages can be added to improve system robustness. 

All three techniques, modes 1-3, utilize a local proxy object In order to isolate 
the application programmer from the complexities of the replication process. The 
proxy object serves as a local surrogate for the remote objects, thereby eliminating 
the need for an application programmer to write network, checksum, and error 
recovery code that can be handled by the proxy and object router working 
together. 
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In practice, one or more of these techniques may be utilized to maintain 
replicated state between remote objects, and hybrid approaches may also be 
feasible. 

In all the modes, the following operations are common: 

Object locked: object is locked to prevent state Inconsistency. * 

Object cloned: the object is duplicated through serialization. 

Object relocated: the duplicate object is remotely located at a secondary 

node. 

Proxy object / Event subscription: object operations are replicated. 
Object unlocked: object Is unlocked and made available for use, 

rV.F. Evolution; 

Evolution Is one of the more unique and complex processes within the 
preferred embodiment platform. It relies on a combination of other features, 
including object migration, advanced versioning, specialized classloaders, and 
transform functions. 

IV.F.1 . Compiling/ pre-post processor: 
This refers to processes performed either prior/during/or post-compilation. 

Additional steps are performed on either the source or object code to insert the 

additional versioning information required by evolution (prior to software 

deployment). This Is a step that would normally be executed by the software 

developer or a software system integrator or vendor. Several approaches are 

described below. 
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a) Pre-Compilation 

Prior to executing a compiler on the source code, a preprocessor can be 
executed that parses the source code and inserts additional (source) code within 
each class that stores information reflecting the version required for each external 
class utilized by that class. For Instance, class % A* relies on classes *B', X C, and % D/ 
Prior to compilation, code would be inserted into class X A' indicating the versions of 
classes X B', y C, and *D' that are required, along with the version (most likely a 
computed hash of the class definition) of class S A\ This type of a process is 
sometimes referred to as "source code insertion" 

b) During Compilation 

A specialized source code compiler can be constructed that Inserts the 
versloning Information, similar to that shown in FIG.5(a), automatically during 
the compilation process. For each object being compiled, the compiler's 
"external symbol table" contains references to each external class or method 
referred to by the class being compiled. An extended compiler design would add 
version information to the external symbol table and store this information In 
the produced object code for each class. This would be an extension or revision 
to the language's compiler. 
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c\ Post-Compilation: 

It is possible to read/process Java object bytecodes (or other 
generated object code) after compilation and Insert bytecodes which contain the 
versioning information. This is performed typically as shown in 5(a), the 'embedded 
object versioning" diagram, A software tool to perform this process would typically 
be executed by a software developer or independent software vendor (ISV). The 
biggest single advantage of this approach Is that it would be feasible to apply this 
to third party software for which source code is unavailable. Methods 1 and 2 
above do not work as well for software without available source code. This could be 
referred to as ^object code insertion." . 

It should be noted that Java Object ByteCodes are roughly equivalent to 
assembly/machine code produced by the compilation of C/C++ software. Similar 
methodologies may apply for non-Java based systems. 

An alternate approach to storing the versioning information directly within 
the object classes (which currently appears to be the favored approach) is to store 
this Information externally as a lookup table stored In the node(s) or made available 
to the node(s) on some alternate data server or node. Each node's Classloader 
could then access this Information during class loading without It being directly part 
of the object. 
For all scenarios: 

The specialized ClassLoader residing within each node relies on the 'extended 

versioning' Information that Is made available for each class to load the appropriate 
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version of each object required In the dependency tree. When the Node's 
Classloader loads Version x l' of Class 'A' It gains the knowledge that it needs to 
acquire/load version x 2' of Class x &* and version '3' of Class % C 

Also, while Non-Java based classloaders are hot commonplace today, In 
theory such constructs could be created (I.e., a C++ Classloader or an Oracle™ 
database Classloader) for any object based language. The version numbers of the 
classes being compiled would be stored along with the version numbers of the 
external classes. 

Utilizing computed hash values for each class Is a preferred method for 
producing version numbers (due to uniqueness), although arbitrary version 
numbers [such as labels from a configuration management products like ClearCase, 
SourceSafe, CVS, or Continuous could also be utilized]. This process may be used in 
conjunction with source code repository/configuration management tools - such as 
the above mentioned products. 

IV.F.2. Version hierarchy: 

FIG.5(a) shows an example of version information that may be associated 
with objects/classes. This Is intended to illustrate the relationship between object 
class versions. Version numbers illustrated In this figure are arbitrary. Every object 
class Is assigned a "VersionlD," however the version table may also be stored 
externally to the class. 

The version hierarchy is formed during either: 

a) The execution of the source code pre-processor 

b) The execution of the ^modified" compiler 
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c) The execution of the object code post-processor. 
The version hierarchy Is normally 'implicitly stored' at runtime/deployment 
time within the objects: 

Ie: Object Class 'A' has knowledge that it depends on 'B' and 'C 
Object Class 'B' knows that It relies on Class 'D' 
Object Class *C knows that it relies on Class 'E' 
This would implicitly form the hierarchy of: 
A -> B -> D 
A -> C -> E 

Or, pictured another way: 
A 

B C 
D E 

This tree could expand indefinitely - and the 'terminal' nodes of the tree are 
(ideally) the Operating System and/or Hardware component identifiers. 
(In the absence of available information through the hardware layer, the 
deployment process will take a 'best effort' approach to deployment and attempt to 
find a close fit.) 

For example, Object Class *C may depend on Object Class 'E' which depends 
on Java J RE 1.2 which depends on Solaris 8, OS patch version 482 which depends 
on 128MB RAM, 200MB drive storage, and a 100MHZ SPARC chip version 'E' or 
later. 
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Within traditional systems, object dependencies are not normally utilized 
outside of the compiler and normally do not extend beyond the confines of the 
language. After compilation, such information is typically discarded. The disclosed 
system stores and utilizes both object dependencies and versioning information 
throughout deployment and runtime of the objects. This enables enhanced 
software object tracking and transitioning. FIG.4 shows an object class version tree. 
In this figure, w v" represents the version number. The Object Class names in the 
figure are arbitrary and this diagram can also illustrate computer language 
versions, operating system versions, and hardware versions, (le: "Altitude" could be 
replaced with "Java" or "Linux" or "Intel Pentium").To form the Version Tree 
Hierarchy the following Information sources are relied upon: 

a) Compiler (or compiler-like) algorithms that ensure method-signature 
compatibility among objects and components to be deployed. 

b) Programmer specified or Configuration Management Tool specified version 
dependencies (CVS, Continuus, Clearcase, SourceSafe, etc). 

c) Information specified by vendors of third party software: 

IE: Oracle, Sun, IBM, and other vendors specify the platform on which 
their software may be deployed and often provide this Information in the 
form of "Release Notes" documentation. 

d) Information system producers (or other outside parties) gather through 
other means and add to a repository of compatible configurations. 
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IV.F.3. Version lo okup table 

FIG.5(b) provides an Illustration of how the version table may be stored 

separately from the object class. This Information may be stored In one or more 

nodes or In an external server or database. A lookup table exists within each object, 

consisting of a list of all Objects required by this object, and the appropriate version 

of each object to load. The Version lookup table' should be apparent from the 

information provided In the hierarchy and pre-processor sub-sections. The lookup 

table is a set of relationships between specified Classes and the versions of the 

Classes they depend on. A lookup table exists within each object, consisting of a 

list of all objects required by this object, and the appropriate version of each object 

required. The lookup table is a set of relationships between specified Classes and 

the versions of the Classes they depend on. This is stored either: 

a) within each object to be deployed. 

b) within one or more node(s). 

These relationships are utilized by the object Classloader in order to 
determine dependencies when objects are loaded within the node(s). 



TV.F.4. Custom Classloaders 
Specialized classloaders are utilized that permit simultaneous loading of 

multiple object versions. Traditional systems (and their classloaders) only permit 

the loading of a single version of an object class at any given time. 

TV P s Transform flxVprocess: 
FIG.6 illustrates an object/class transform mapping scenario for an object of 

class "Employee". It should be noted that the transform functions could be 
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combined Into a single transform function, or further subdivided Into more 
functions. In this figure, VI is version 1 and V2 Is version 2. 

As shown in FIG.6, for each object instance the following steps are 
performed: 

1) Lock Object 

2) Create version 2 (v2) proxy/Instance 

3) Apply transform function 

4) Subscribe Change Events 

5) Unlock Object 

For each new version of an object, a series of transform functions are 
created. The purpose of the transform function is to Mranslate' an object from a 
prior version to the current one. For many common transforms, pre-defined 
functions may exist. For more difficult transforms, programmer intervention may 
be required. 

IV.R6. Preferred Embodiment of Evolution 

The preferred embodiment of the general evolution process is described herein. 

1) Evaluate the software to be installed or upgraded: 

a. Create a version hierarchy tree of all objects at compile time. 

b. For each class of objects, perform an object class *dlff comparison 
between the old class versions and newly created versions. 

c. For each state variable or method signature that has changed, 
create a transform function. 

d. At deployment time: for each class of objects to be 
upgraded/Installed, do: 
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i. Loop for all object Instances of each class: 

1. Lock object to prevent state changes 

2. Create an uninitialized Instance of the new object class 
version and corresponding proxy objects and 
replicates as required. 

3. Initialize the variables of the new object class Instance 
by applying the transform function to the prior object 
version, 

4. Subscribe to change events. 

5. Unlock object. 

e. Upon completion, there will be a 'mirror' copy of the original 
system, containing new versions of all objects. At this point, new 
^client-side" software is deployed. 

f. Upon completion of new ^client-side" software, prior versions of 
objects may be disposed of. Alternately, multiple versions may be 
retained, but with significant cost of resource utilization. 

The preferred embodiment incorporates several extensions to objects within 
the platform to facilitate the mechanisms required for migration and replication. 
Objects within the preferred embodiment described herein support the following 
'base class' extensions : 

• Object locking: Objects support distributed read and write locking 
mechanisms. 
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• Change Events: Change events are generated when a state change 
occurs within the object These change events may be remotely 
subscribed to across nodes. 

• Versioning of objects: Objects are assigned a version ID that is either, 
the computed hash of their Interface or an assigned value by 6Ither the 
developer or configuration management tool. Additionally, objects are 
aware of the version IDs of objects they utilize. 

• Objects serializable: Remotely accessed objects have the ability to be 
converted to a byte stream for transit across a network. 

• Object Extension: Objects are run-time extendible through object 
references. 

TV.R7. Evolution of Hardware 
Evolution of Hardware Is handled in a somewhat different (though related) 

fashion than Software: 

During upgrade or maintenance of a piece of hardware: 

a) The 'state' of the node to be upgraded, replaced, or maintained is changed to 
reflect a 'down', 'offline/ or 'upgrading' state within the ObjectRouter(s) 
managing it. 

b) The objects residing on the node to be upgraded are Youted' to alternate 
nodes, (similar to a failure scenario). 

c) The hardware is replaced, modified, or upgraded through whatever 
procedure is required (le: opening the case and replacing memory, drives, 
etc). 

d) The hardware is restored and brought back on line. 
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e) The hardware undergoes the 'melding' procedure to (re)joln the 
'collective' of nodes. 



The following paragraphs describe how hardware evolution Is related to 
software evolution: 

It is possible In a 'dependency hierarchy' that upgrading the software for a 
given node will also require a hardware upgrade (it is anticipated that the 
ObjectRouter will have detailed knowledge of each node's hardware capabilities). 

For instance, a node may require additional memory or a new audio card to 
support the latest version of Java upon which the application software is based. 
The disclosed system's dependency tree goes deeper than "software object 
versions" and extends into language versions, runtime library versions, and 
Operating System requirements as well. Operating System features may In turn 
rely on hardware availability. This means that at deployment time, a piece of 
software may notify the installer that the hardware required to support It Is 
unavailable. This would then lead into the hardware upgrade (mentioned above) 
required to support the software transition. 

IV.G. Classloader changes; 

Classloaders are a mechanism utilized within some object-oriented languages 
for loading object classes Into memory. A common technique for changing the 
behavior of class loading within such systems is sometimes referred to as 
"classloader chaining." Utilizing this technique, varying behaviors can be given to 
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various classes loaded within the system. An example of classloader chaining is 
shown in FIG. 7. For instance, a "Primordial Classloader" 7.1 could be used to load 
language specific objects and grant these objects maximum permission to access 
the host system's files, memory, and network. Sub-classloaders which are 
'chained' to the parent (primordial) classloader could behave differently*. For 
instance, The "BootTime ClassLoader" 7.2 could add the behavior of loading classes 
stored in a remote server. The ClassLoader associated with Application Objects 7.4 
could place security restrictions on those objects, prohibiting them from accessing 
the local machine's files. One particular limitation of a standard classloader 'chain' 
is that it provides no mechanism for handling multiple versions of a single object 
class. 

The system described herein has solved this problem by creating a 
ClassLoader Hierarchy, rather than a chain structure. FIG.8 shows an example of a 
classloader hierarchy with multiple object versions loaded. Utilizing a classloader 
hierarchy permits a host node In the disclosed system to load multiple versions of 
the same object class simultaneously. 

This, when combined with other disclosed techniques permits a seamless 
transition between software versions. Using traditional classloader techniques, one 
would need to terminate the software process and restart it In order to load a new 
version of a class that had already been loaded Into memory. 
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IV.G.l. Example: 

A User or the System starts a software process and loads class "Truck", 
version 1: 
class Truck { 

static String versIonID = 1; 

public void startEngineO { 

} 

public void moveFaster() { 
} 

} 

Later, the definition of class Truck" Is changed and version 2 is produced, as 
shown below: 
class Truck { 

static String versionID = 2; 

public void warmGlowPlugs() { 

} 

public void startEngineO { 
> 

public void accelerateQ { 
> 
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Normally, a restart of the process would be required In order to load version 
2 of Truck. Additionally, loading version 2 of "Truck" would remove version 1 of 
class "Truck" from memory and break any software relying on usage of the 
moveFaster() method In version 1. Utilizing the specialized classloader hierarchy 
design, both version 1 and version 2 can co-exist within the process without any 
interruption of service to clients or other objects relytng on either version 1 or 
version 2. 

IV.H. JVM Extensions: 

Several extensions are provided to the Java Virtual Machines (JVM) as 
follows: 

• Extended Process Control: APIs are provided for tracking process 
resource utilization, starting, changing the priority of, and stopping 
processes. 

. Extended Thread Control: APIs are provided for obtaining greater 

control over executing system threads. 
. Virtual Processes within VM: Pseudo processes within the VM are 

created through a combination of threads/thread groups and specialized 

classloader mechanisms. 
. Specialized Classloader with support for versioning: Support Is 

provided for loading multiple versions of the same object simultaneously 

within the same VM. 
It should be noted that not all the above extensions are strictly required, but 
they provide for better system operation. 
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IKL Failure of object router with multiple instances of object routers: 

The disclosed technique provides for recovery from failure of object routers. 
If there are two or more nodes functioning as object routers, each object router will 
be paired with at least one secondary mirror. Each object router is aware (directly 
or Indirectly through a hierarchy) of other object routers within the system. 
A failure is detected in a fashion similar to what is described in section IV.I. 
Upon failure the following steps are performed. 

1) Nodes utilize the services of secondary object routers (which become 
primary). 

2) The Yeplicated' object router is 'cloned' (using techniques similar to the 
object replication described In section IV. E. above) and activated on at least 
one secondary node. 

3) All nodes managed by the object router are notified of the new secondary 
location(s). 

4) Other object routers are notified of the new secondary location(s). 

In addition to the above steps, the following steps are optionally performed: 

1. System Administrator Is notified of failure. 

2. Other systems are notified of failure. 

3. Failed object router Is requested to reboot Into a 'recovery' mode. 
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IK/. Failure of object router when only one instance of object router exists: 
Upon failure (detected as described In section IV.I), the following steps are 
performed: 

1) Remaining nodes 'elect' a new node to function as object router 

(if only one node exists, that node will perform both application processing 
and router functions) 

2) Node elected to function as object router starts object router module and 
reads last known state of prior router from a database, set of flat files, or 
other data server. [If possible] 

3) Announcement Is broadcast to nodes notifying them of the availability of new 

object router. 

4) Newly started object router requests from each node It directly manages: 

a) Current status Including system capabilities and current Moad/utlllzatlon' 
statistics. 

b) List of current objects residing on the managed node. 

In addition to the above steps, the following steps are optionally performed: 
System Administrator Is notified of failure. 
Other systems are notified of failure. 

Failed object router is requested to reboot Into a 'recovery' mode. 
IV.K. Failure of Node 
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When a node within the disclosed system falls, that failure can be detected through 
several means Including but not limited to: 

1) A 'proxy' object from another node Is unable to reach the node. The failure is 
reported to the object router. 

2) A node falls to respond to 'heartbeat' requests from the object router. 

3) A node fails to report either status or statistics to the object router within a 
given time span or upon request. 

When any node failures are detected, the object router responds by: 

1) Notifying other object routers and managed nodes that the node has failed. 

2) For each object that resided on the failed node: 

- the replicated objects (located on other nodes) become live/primary. 

- Object router locates additional computing resources within the system. 

- Object router locates the remaining object 'replicates' and makes at least 
one additional "clone" / replicate on secondary nodes. 

notifies other object routers and nodes of the new location of the 
replicates. 

3) Notifies the system administrator that the node has failed and Is now 'offline' 

4) Optionally notifies other systems that the node has tailed. 
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5) If possible, reboots the system node into Yecovery' mode - which attempts to 
perform self-diagnostics and repair. (Reinstall/checksums of installed software and 
hardware, and other related diagnostic tasks are performed). 



It is assumed that every object within the system will normally have at least 
one replicate at all times. In higher availability configurations, objects may have 
multiple replicates located on multiple nodes, Increasing the fault tolerance level. 
'Standard" configurations (with at least one replicate) will tolerate the failure of any 
one node within a small time delta. "Highly-redundant" configurations (with 
multiple replicates) may tolerate multiple node failures simultaneously. 

IV.L Encryption 

Within the disclosed system, all messages and network traffic sent between 
nodes may be encrypted utilizing techniques including but not limited to: public and 
private key encryption, digital certificates, digital signatures, secure message 
digests, and shared private key encryption of data. 

Additionally, the encryption algorithms and approaches utilized may Include 
but are not limited to: DES, triple DES, BlowFlsh, RSA public key, SSL,TLS, PKCS, 
IMAP, LDAP, Kerberos, S/MIME, RC4, Diffle-Hellman, and DSA .public key. 

Various authentication schemes may be utilized. This Involves a variety of 
digital signature and biometric approaches including but not limited to: PGP, LDAP, 
Kerberos, thumbprint scanners, handprint scanners, retinal scanners, volceprint 
recognition, passwords, one-time passwords, hardware based key generators, 
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smart cards, embedded devices, or any combination thereof. Such authentication 
schemes may be applied at the system, user, transport, object, or binary level. 



Similar encryption and authentication techniques may also be applied to all 
messages, checksums, and network traffic sent between components of the system 
including but not limited to melders, object routers, migrators, replicators, code 
version servers, classloaders, node controllers, and node resource monitors. 

Furthermore, within the disclosed system, messages and network traffic sent 
to user applications, end-users, and external systems may also be encrypted and 
authenticated in a similar fashion. 

Those skilled in the art of encryption may recognize other hardware and 
software techniques which could also be easily applied to the disclosed system. 

IV.M. End-User Benefits: 

The end-users of a computing platform embodying the disclosed techniques 
and teachings experience a variety of benefits. FIG.9 shows examples of locations 
where a system embodying the disclosed techniques can be deployed and utilized.. 

For the purpose of this disclosure, a Client System is described as a machine, 
computer, device, or software presenting a user Interface to an end user of 
computing services. 

A Client-Server system that has a client system that Interacts with a server 
system utilizing the disclosed teaching's and techniques, and hosting customer's 
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services experiences these benefits. The Client System will not experience 
performance degradation or failure due to hardware/software upgrades or failures 
within the Server System. 

Similarly, a Server-Server system with a server system W interacting with a 
server system *B' utilizing the disclosed teachings and techniques, and hosting 
customer's services experiences these benefits. The Server System *A' will not 
experience performance degradation or failure due to hardware/software upgrades 
or machine failures within the Server System X B' 



IVM End-User Experience: 

The flexibility of the preferred embodiment embodying the disclosed 
teachings and techniques permits it to be deployed la any system that requires 
growth, stability, and optimal resource management. From an end-user's point of 
view, the system meets the same standard of dependability that basic utilities such 
as water, electricity, and phone provide. 

When software application users (or'Web surfers") encounter messages like 
"Server cannot be located...," "Host did not respond...," "System unavailable...," or 
timeouts Including no response from the server, they are annoyed and try again 
for perhaps a second time before giving up. Consumers of today's networked 
systems assume that better technology Is being constructed but currently available 
systems are not yet ready to provide these essential services . Without solving 
chronic problems hidden In the computing environments that provide the backbone 
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services users depend on, these users' hopeful progress will instead turn into 
more painful experiences. 

Everyday, we watch TV, listen to radio, read papers, and talk on the phone. 
These are some of the 'essential" activities that we take for granted and demand 
be available to us 24 hours a day 7 days a week. In addition to the traditional 
necessities (such as telephone and electricity), we Increasingly look toward ways to 
use continuously evolving Information technology to bring new services for better 
productivity and entertainment. Services like on-demand video and audio over the 
internet will soon be widely accepted and these services will be expected to perform 
both continuously and reliably. An individual watching the SuperBowl broadcast 
over the Internet would expect perfect delivery of the event (including 
commercials). A system that embodies the disclosed techniques guarantees this 
through Its multi-layer redundancies. As more and more people subscribe to 
events broadcasted over the Internet (say from 1,000 to 1,000,000), such a 
system scales without ever needing to disrupt the services. Existing and new users 
all enjoy the service as if they are the first and only user. When a new feature such 
as a 3D broadcast Is made available, users relying on the disclosed system are 
given the choice to add this feature without experiencing degradation of current 
services or needing to wait for system expansion. More familiar examples Include 
the outages and flux of services that have occurred at major corporations Including 
Ebay, Hotmail, Yahoo, and AOL as systems have strained to accommodate rapid 
growth and change. A system using the disclosed techniques, provides the end 
user with a true 24 hour X 365 day environment. Most production systems today 

have a "schedule" of downtimes for planned software and hardware maintenance. 
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Even during hardware and software upgrades a system embodying the disclosed 
techniques and teachings remains available for all users. End users can continue 
interacting with the system ignorant of the fact that a software or hardware 
upgrade is underway. This provides unprecedented levels of availability. An 
important feature of the system embodying the disclosed technique is that the 
users will not experience the downtime typically associated with machine failures, 
replacements, and upgrades. 

IV. O. Customer Benefits: 

A Customer of a system that uses the disclosed techniques Is a direct 
purchaser of the system who utilizes It to provide reliable services to end-users. 
Ukely customers Include Fortune 500 Corporations, e-commerce/web companies, 
System Integrators, and Application Service Providers (ASPs). 

Customers experience these benefits: 

1) Reliability, Scalability, Availability: The system Is by its nature highly redundant 
/ fault tolerant and scales easily to meet Increasing system demand. 

2) Ease of Use: The system simplifies administration of large numbers of machines. 
Once a node In the system Is loaded with necessary components, that system Is 
administered by Itself and the object router. The entire system can be 
administered from a single unified interface. 
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3) Incremental Capacity Upgrades: The system permits easy upgradablllty by 
"plugging In" (melding) additional nodes. 



4) Failure Recovery: the system "routes around" node failures and self-heals (given 
available hardware resources). 

5) Seamless software and hardware upgrades: The system permits software and 
hardware upgrades to occur while remaining 100% available to existing users. 

IKP. Process Of Evolution 

The process of evolution Is described herein from an end -user's perspective. 
A user actively utilizing an application built with Evolution will not notice any change 
In the system behavior while the system Is evolving. Upon completion of an 
upgrade, the user will see the new features/functionality available to them. 

TV.P.l. Example 1 

Users of a video-on-demand system are selecting and watching videos for 
free on a new web site: FreeVid.com 

FreeVid.com realizes they are losing money and need to begin charging for 
their services. While users continue to select and watch videos, the system Is 
evolved and new software Is deployed that will require them to enter a credit card 
number. After the system has completed if s upgrade, users are presented with a 
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screen Informing them of the bad news, providing them with credits for 3 free 
vlewlngs, and requesting payment for continued service. 



TVP.2. Example 2: 

Users of a Bill Payment system that accesses their checking accounts are 
logged in and utilizing it's services. While they remain online, the system is 
upgraded to also accept credit card payments. 

jy.Q, From a Customer's perspective 

A customer purchasing a system embodying the disclosed techniques 
perform the tasks described herein. When the customer is ready to upgrade their 
system, they log into a system administration console. From the administration 
console, they will select a software module or set of modules to upgrade/replace. 
They then initiate the upgrade by pressing a *start' or *begin' button. A progress 
meter will show the upgrade occurring,.and upon- completion the new feature set 
will be available. The administrator may then *turn off or discard prior software 
modules that are no longer needed. 
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IV.R. From the Developer's perspective: 

A customer building software to be deployed on a system embodying the 
disclosed techniques performs the tasks described herein. The developer will 
construct software modules that comply with the APIs for the disclosed system. 
They will then compile their software. After compilation, either a pre-processor, 
language compiler or "post-processor" will analyze the class/object files and insert 
versionlng information. It will then perform an object class level 'diff between 
software versions. When possible the system will create automatic transform 
functions. In other cases, the developer may need to select from pre-defined or 
create custom "transform" functions that enable the objects to migrate from one 
version to the next. Finally, the modified class/object files and transform functions 
are packaged into an archive format for delivery and deployment. 

IKS. Computer program product 

Apart from systems and methods, computer program products are also 
within the scope of the disclosed teaching. These computer program products 
comprise instructions on a computer readable medium that enable a computer to 
perform the methods disclosed herein. The instructions are not limited, and Include 
but not limited to, source code, object code and executables. The computers on 
which the instructions are Implemented Include, but not limited to, minis, micros, 
and mainframes. The computer readable medium Includes, but not limited to, 
floppies, RAMs, ROMs, hafd drives, magnetic tapes, cartridges, CDs, DVDs, and 
Internet downloads. 
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It should be noted that the system components and the processes disclosed 
can be Implemented In any way In a selected network system. Though a software 
Implementation Is preferable, it can also be implemented by a hardware or a 
hardware/software combination. For example, a subset of components or processes 
can be implemented using pure hardware as by using ASICs. Similarly,' in the 
software implementation, there Is no restriction regarding the choice or level of 
computer languages. For example, higher level languages like Java, C++, etc, could 
be used or lower level languages including assembly and machine languages could 
be used. Or else, a combination of computer languages could be used. Though 
some of the preferred embodiments described herein might appear to reveal a 
Java-based implementation, this is not meant to be restrictive. The choice of 
language and implementation would depend on many factors including the needs of 
the user and availability of resources. 

Also, compatibility issues should not be treated as being restrictive. For, 
example, it should be noted that a system embodying the disclosed teachings and 
techniques might be loaded on existing network systems. This might cause some 
compatibility issues. However, these compatibility Issues can be overcome by a 
skilled practitioner without deviating from the scope of the disclosed teachings. 

Other modifications and variations to the. Invention will be apparent to those 
skilled In the art from the foregoing disclosure and teachings. Thus, while only certain 
embodiments of the invention have been specifically described herein, it will be 
apparent that numerous modifications may be made thereto without departing from 
the spirit and scope of the disclosed teachings and techniques. 



44 



WO 02/44835 



PCT/US01/43154 



WHAT IS CLAIMED IS: 

1. A method for creating a scalable, fault tolerant computing platform on 
a network, said platform comprising a plurality of nodes, said method comprising: 

a) initiating melding during Initialization, wherein during said melding said 
plurality of nodes are treated as a collection and each node from said plurality of 
nodes joins said collectlbn, and during melding at least a first object router is 
assigned; 

b) migrating objects between said plurality of nodes; and 

c) replicating objects across said plurality of nodes. 

2. The method of claim 1, wherein during melding each node joins the 
collection using a process comprising: • 

(a)(i) the node finding a nearby object router if the object router can be 
found, otherwise becoming the object router; and 

(a)(ii) the node reporting available resources to the object router. 

3. The method of claim 1, wherein during melding each node joins the 
collection using a process further comprising: 

(a)(IIi) assigning network address for the node; and 
(a)(lv) updating components of the node. 
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4. The method of dalm 2, wherein the available resources comprise 
memory, at least one processor and drive storage space. 

5. The method of claim 2, wherein the available resource comprises 
network bandwidth. 

6. The method of claim 2, wherein the available resources comprise drive 
storage space. 

7. The method of claim 2, wherein said updating is done using data 
stored in a code version server. 

8. The method of claim 2, wherein it is ensured that at least one mirror of 
an object router exists all the time. 

9. The method of claim 1, wherein the object router manages resources 
and distributes load evenly across the plurality of nodes. 

10. The method of claim 1 wherein the object router maintains knowledge 
of locations of at least a subset of all nodes In the computing platform. 



11. The method of claim 1 wherein the object router relocates or replicates 
objects within the plurality of nodes. 
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12. The method of claim 1, wherein said migration of an object Is 
performed using a process comprising: 

(b)(i) locking the object; 

(b)(ii) converting the object to a byte stream at a first node in the plurality of 
nodes; 

(b)(IH) transferring the byte stream across the network; 
(b)(iv) converting the byte stream back to the object at a second node In the 
plurality of nodes; 

(b)(v) updating a location of the object in the object router; and 

(b) (vi) unlocking the object. 

13. The method of claim 1, wherein said replication of an object is 
performed by a process comprising: 

(c) (1) locking the object at a first node; 
(c)(ii) forming a duplicate object, ; 

(c)(iil) locating the duplicate object at a second node; and 
(c)(iv) unlocking the object. 

14. The method of claim 13, further comprising: 
(c)(v) subscribing to events associated with the object. 

15. The method pf claim 13, wherein said duplication is performed by 
creating a byte stream from the object and converting said byte stream into a 
duplicate object. 
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16. The method of claim 1, wherein the computing platform allows for 
evolution, wherein said evolution provides for transitioning of system software and 
hardware such that a seamless upgrade Is possible. 

17. A method for performing evolution, said evolution being performed to 
enable a seamless transitioning of software, said method comprising: 

(a) evaluating a software or hardware to be installed or upgraded; 

(b) creating a dependency hierarchy tree of all objects In the computing 

platform at compile time; 

(c) performing object class dlff comparison between old and new versions of 

at least one class of objects; 

(d) creating a transform function for each state variable that is changed; 

(e) selecting an object that Is an Instance of the selected object class; 
.(f) locking the selected object; 

(g) creating uninitialized Instance of the new version of the selected object; 

(h) initializing variables of the new object class Instance by applying 
appropriate transform functions to the old object version; 

(I) unlocking the selected object; and 

(j) repeating steps e-l for each instance of the object class. 

18. The method of claim 17, further comprising: 

(I) repeating steps e-k for each object class In the computing platform. 
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19. The method of claim 17, further comprising: 
(m) deploying client-side software. 
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20. The method of claim 17, further comprising: 

(n) destroying older version of the objects If multiple versions of the objects 
are not required. 

21. The method of claim 1, wherein the object router maintains multiple 
Instances of each of said objects. 

22. The method of claim 1, wherein a failure of a first node from the 
plurality of nodes is detected when a proxy object from a second node is unable to 
reach the first node. 

23. The method of claim 1, wherein a failure of a node from the plurality of 
nodes is detected when the node fails to respond to heartbeat requests from the 
object router, 

24. The method of claim 1, wherein a failure of node from the plurality of 
nodes Is detected when the node falls to report either status or statistics to the 
object router. 
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25. The method of claim 1, wherein when a node from the plurality of 
nodes falls, said node being called a failed node, said at least one object router 
responds using a process comprising: 

(i) notifying other object routers other than said at least one object router; 

(il) selecting an object that resided In the failed node; 

(lit) making replicated object corresponding to the selected object residing In 
nodes other than the failed node primary; 

(iv) locating additional computing resources within the computing platform; 

(v) creating new replicates for the newly created primary objects in step lii 
and locating them In nodes other than nodes in which they originally resided; and 

(vl) notifying said other object routers of new location of the new replicates. 

26. The method of claim 25, further comprising : 

(vii).notifying a system administrator about the failed node being off-line. 

27. The method of claim 25, further comprising repeating steps ii-vi for all 
objects in the failed node. 

28. The method of claim 1, wherein when an object router from said at 
least one object router falls, said object router, that fails being called a failed object 
router, the computing platform responds using a process comprising: 

(I) utilizing services of object routers other than said failed object router; 

(II) replicating the failed object router to create a replicate object router; 

(lii) locating the replicate object router on at least one functional node ; and 
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(Iv) notifying at least a subset of the nodes of a new location of the replicate 
object router. 

29. The method of claim 28, further comprising: 

(v) notifying at least a subset of other object routers of the new location of 
the replicate object router. 

30. The method of claim 1, when said object router is aware of at least a 
subset of other object routers. 

31. A network computing platform comprising a plurality of nodes, said 
system comprising: 

at least an object router , 

wherein said object router Is capable of managing resources and distributing 
load evenly across the plurality of nodes, and 

wherein said object router maintains a knowledge of locations of at least a 
subset of all nodes in the computing platform. 

32. The platform of claim 31 wherein the object router relocates objects 
within the plurality of nodes. 

33. The platform of claim 31, wherein the object router maintains multiple 
instances of each object that is deployed In the platform. 
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34. The platform of claim 31 wherein each of said nodes comprise: 
an object melder that treats the plurality of nodes as a collection that 

performs melding to create said plurality of nodes and said at least one object 
router. 

35. The platform of claim 34, farther comprising an object versloner that 
maintains multiple versions of all objects within the platforms. 

36. The platform of claim 34, further comprising an object replicator that 
replicates an object to maintain a duplicate of the object in a node different from 
the node in which the object resided. 

37. The platform of claim 34, further comprising an object migrator that 
enables an object to migrate from one node to another. 

38. The platform of claim 34, further comprising a resource monitor. 

39. The platform of claim 34, further comprising a resource controller. 

40. The platform of claim 34 further comprising an application object. 

41. The platform of claim 34 wherein each of said nodes further comprise: 

a class loader that Is responsible for loading classes of objects, wherein the 

class loader permits multiple versions of a same class to co-exist. 
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42. The platform of claim 31 further comprising: 

an evolution module, wherein the evolution module allows the computing 
platform to perform evolution, wherein said evolution provides for transitioning of 
system software and hardware such that a seamless upgrade is possible. 

43. The platform of claim 31, wherein a failure of a first node from the 
plurality of nodes is detected by the platform when a proxy object from a second 
node is unable to reach the first node. 

44. The platform of claim 31, wherein a failure of a node from the plurality 
of nodes is detected by the platform when the node falls to respond to heartbeat 
requests from the object router. 

45. The platform of claim 31, wherein a failure of node from the plurality 
of nodes is detected by the platform when the node fells to report status or 
statistics to the object router. 

46. The platform of claim 31, wherein when a node from the plurality of 
nodes fails, said node being called a failed node, said at least one object router is 
capable of notifying other object routers other than said at least one object router. 
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47. The platform of claim 31, wherein when a node from the plurality of 
nodes falls, said node being called a failed node, said at least one object router is 
capable of notifying other nodes 

48. The platform of claim 31, wherein when a node from the plurality of 
nodes fells, said node being called a failed node, said at least one object router Is 
capable of making replicated object corresponding to at least a subset of objects 
residing In nodes other than the failed node primary. 

49. The platform of claim 31, when a node from the plurality of nodes 
fells, said node being called a failed node, said at least one object router is capable 
of locating additional computing resources within the computing platform and 
creating new replicates for newly created primary objects and locating them in 
nodes other than nodes in which they originally resided. 

50. The platform of claim 49, wherein the platform Is capable of notifying 
said other object routers of new location of the new replicates. 

51. The platform of claim 49, wherein the platform is capable of notifying a 
system administrator about the failed node being off-line. 

52. The platform of claim 31, wherein when an object router from said at 

least one object router falls, the computing platform Is capable of utilizing services 

of object routers other than said failed object router. 
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53. The platform of claim 31, wherein when an object router from said at 
least one object router falls ,the computing platform is capable of replicating the 
failed object router to create a replicate object router and locating the replicate 
object router on at least one functional node, 

54. The platform of claim 53, wherein the platform Is further capable of 
notifying at least a subset of the nodes of a new location of the replicate object 
router. 

55. The platform of claim 31, wherein said object router is aware of at 
least a subset of other object routers. 

56. An object router for a network computing platform comprising a 
plurality of nodes, wherein said object router is capable of managing resources and 
distributing load evenly across the plurality of nodes, and wherein said object router 
maintain a knowledge of locations of at least a subset of all nodes in the computing 
platform. 

57. The object router of claim 56, wherein the object router relocates 
objects within the plurality of nodes. 

58. The object router of claim 56, wherein the object router maintains 

multiple instances of each object that Is deployed in the platform. 
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59. The object router of claim 56, wherein when a node from the plurality 
of nodes fails, the object router is capable of notifying other object routers other 
than said object router. 

60. The object router of claim 56, wherein when a node from the plurality 
of nodes falls, the object router Is capable of making replicated object 
corresponding to at least a subset of objects residing in nodes other than the failed 
node primary. 

61. The object router of daim 56, wherein when a node from the plurality 
of nodes falls, the object router Is capable of locating additional computing 
resources within the computing platform and creating new replicates for the new 
primary objects and locating them in nodes other than nodes In which they 
originally resided. 

62. The object router of claim 56, wherein the object router is aware of at 
least a subset of other object routers. 

63. A computer program product Including a computer-readable medium 
comprising Instructions to enable a computer to Implement a method for creating a 
scalable, fault tolerant computing platform on a network, said platform comprising 
a plurality of nodes, said Instructions comprising Instructions for: 
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a) initiating melding during Initialization, wherein during said melding said 
plurality of nodes are treated as a collection and each node from said plurality of 
nodes joins said collection, and during melding at least a first object router is 
assigned; 

b) migrating objects between said plurality of nodes; and 

c) replicating objects across said plurality of nodes. 



64. The program product of claim 63, wherein during melding each node 
joins the collection using a process comprising: 

(a)(i) the node finding a nearby object router If the object router can be 
found, otherwise becoming the object router; and 

(a)(ii) the node reporting available resources to the object router. 

65. The program product of claim 63, wherein during melding each node 
joins the collection using a process further comprising: 

(a)(IIi) assigning network address for the node; and 
(a)(Iv) updating components of the node. 

66. The program product of claim 64, wherein the available resources 
comprise memory, at least one processing unit and drive storage space. 

67. The program product of claim 63, wherein the available resource 
comprises network band with. 
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68. The program product of claim 64, wherein said updating Is done using 
data stored In a code version server; 

69. The program product of claim 64, wherein it is ensured that at least 
one mirror of an object router exists all the time. 

70. The program product of claim 63, wherein the object router manages 
resources and distributes load evenly across the plurality of nodes. 

71. The program product of claim 63, wherein the object router maintains 
a knowledge of locations of at least a subset of all nodes in the computing platform. 



72. The program product of claim 63 wherein the object router relocates 
objects within the plurality of nodes. 



73. The program product of claim 63, wherein said migration of an object 
is performed using a process comprising: 
(b)(1) locking the object; 

(b)(ii) converting the object to a byte stream at a first node in the plurality of 
nodes; 

(b)(IH) transferring the byte stream across the network; 

(b)(lv) converting the byte stream back to the object at a second node In the 

plurality of nodes; and 

(b)(v) updating the location of the object In the object router; and 
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74. The program product of claim 63, wherein said replication of an object 
Is performed by a process comprising: 

(c)(1) locking the object at a first node; 

(c)(ii) forming a duplicate object; 

(c)(iii) locating the duplicate object at a second node; 

(c)(iv) subscribing to events associated with the object; and 

(c)(v) unlocking the object. 

75. The program product of claim 74, wherein said duplication is 
performed by creating a byte stream, from the object and converting said byte 
stream Into a duplicate object. 

76. The program product of claim 63, wherein the computing platform 
allows for evolution, wherein said evolution provides for transitioning of system 
software and hardware such that a seamless upgrade Is possible. 

77. A computer program product including a computer readable medium 
comprising instructions for enabling a computer to perform evolution, said evolution 
being performed to enable a seamless transitioning of software, said Instruction 
including instructions for: 

evaluating a software or hardware to be Installed or upgraded; 
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(b) creating a dependency hierarchy tree of all objects In the computing 
platform at compile time; 

(c) performing object class diff comparison between old and new versions of 

at least one class of objects; 

(d) creating a transform function for each state variable that Is changed; 

(e) selecting an object that Is an Instance of the selected object class; 

(f) locking the selected object; 

(g) creating uninitialized instance of the new version of the selected object; 

(h) initializing varialbles of the new object class instance by applying 
appropriate transform functions to the old object version; 

(i) unlocking the selected object; and 

(j) repeating steps e-i for each Instance of the object class. 

78. The program product of claim 77, further comprising: 

(I) repeating steps d-j for each object class In the computing platform. 

79. The program product of claim 77, further comprising: 
(m) deploying client-side software. 

* 

80. The program product of claim 77, further comprising: 

(n) destroying older version of the objects If multiple versions of the objects 
are not required. 

81. The program product of claim 63, wherein the object router maintains 

multiple Instances of each of said objects. 
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82. The program product of claim 63, wherein a failure of a first node from 
the plurality of nodes is detected when a proxy object from a second node is unable 
to reach the first node. 

83. The program product of claim 63, wherein a failure of a node from the 
plurality of nodes is detected when the node fails to respond to heartbeat requests 
from the object router. 

84. The program product of claim 63, wherein a failure of node from the 
plurality of nodes Is detected when the node fails to report status or statistics to the 
object router. 

85. The program product of claim 63, wherein when a node from the 
plurality of nodes fails, said node being called a failed node, said at least one object 
router responds using a process comprising: 

(i) notifying other object routers other than said at leastone object router; 
(H) selecting an object that resided in the failed node; 
(III) making replicated object corresponding to the selected object residing in 
nodes other than the failed node primary; 

(iv) locating additional computing resources within the computing platform; 

(v) creating new replicates for the newly created primary objects in step Hi 
and locating them in nodes other than nodes In which they originally resided; and 

(vi) notifying said other object routers of new location of the new replicates. 
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86. The program product of claim 85, further comprising: 

(vli) notifying a system administrator about the failed node being off-line. 

87. The program product of claim 85, further comprising repeating steps 
ii-vl for all objects In the failed node. 

88. The program product of claim 63, wherein when an object router from 
said at least one object router fails, said object router that falls being called a failed 
object router, the computing platform responds using a process comprising: 

i) utilizing services of object routers other than said failed object router; 
il) replicating the failed object router to create a replicate object router; 
Hi) locating the replicate object router on at least one functional node ; and 
lv) notifying a subset of the nodes of a new location of the replicate object 
router. 

89. The program product of claim 63, when said object router Is aware of 
at least a subset of other object routers. 
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